Noisy-to-Noisy Voice Conversion Under Variations of Noisy Condition

Abstract

Voice conversion (VC) refers to the transformation of the speaker identity of a speech signal to that of a target speaker without altering the linguistic content. As recent VC techniques have made significant progress, implementing them in real-world scenarios is also being considered, where the speech data contain some inevitable interferences, the most common of which are background sounds. On the other hand, background sounds are informative and need to be retained in some applications, such as VC in movies/videos. To address these issues, we previously proposed a noisy-to-noisy (N2N) VC framework that does not rely on clean VC data and models noisy speech directly by using noise as conditions. Previous experimental results have proven its effectiveness. In this article, we further improve its performance by introducing a pre-trained noise-conditioned VC model. Moreover, we explore the impacts of noisy conditions, evaluating more realistic situations in which the training set possesses speaker-dependent noisy conditions. The experimental results demonstrate the effectiveness of the pre-training strategy and the degradation of the N2N framework under strict noisy conditions. We then propose a noise augmentation method to overcome this limitation. Further experiments show the effectiveness of the augmentation method.

Similar resources

Modelling a Noisy-channel for Voice Conversion Using Articulatory Features

In this paper, we propose modeling a noisy channel for the task of voice conversion (VC). We use artificial neural networks (ANNs) to capture speaker-specific characteristics of a target speaker, which avoids the need for any training utterances from a source speaker. We use articulatory features (AFs) as a canonical form or speaker-independent representation of a speech signal. Our studi...

Modeling a Noisy-channel for Voice Conversion Using Articulatory Features

Exemplar-Based Voice Conversion Using Sparse Representation in Noisy Environments

This paper presents a voice conversion (VC) technique for noisy environments, where parallel exemplars are introduced to encode the source speech signal and synthesize the target speech signal. The parallel exemplars (dictionary) consist of the source exemplars and target exemplars, which have the same texts uttered by the source and target speakers. The input source signal is decomposed i...

Bayesian Gates for Reliable Logical Operations under Noisy Condition

The reliability of logical operations is indispensable for the reliable operation of computational systems. Since the down-sizing of micro-fabrication generates non-negligible noise in these systems, a new approach for designing noise-immune gates is required. In this paper, we demonstrate that noise-immune gates can be designed by combining Bayesian inference theory with the idea of computatio...

Identification of Cement Rotary Kiln in Noisy Condition using Takagi-Sugeno Neuro-fuzzy System

The cement rotary kiln is the main part of the cement production process and has always attracted many researchers' attention. However, this complex nonlinear system has not been modeled efficiently enough to give appropriate performance, especially in noisy conditions. In this paper, a Takagi-Sugeno neuro-fuzzy system (TSNFS) is used for identification of the cement rotary kiln, and gradient descent (GD) algori...

Journal

Journal title: IEEE/ACM Transactions on Audio, Speech, and Language Processing

Year: 2023

ISSN: 2329-9304, 2329-9290

DOI: https://doi.org/10.1109/taslp.2023.3313426